Overview

Dataset statistics

Number of variables: 10
Number of observations: 936
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 62.2 KiB
Average record size in memory: 68.1 B

Variable types

Numeric: 7
Categorical: 3

Warnings

title has a high cardinality: 935 distinct values
genre has a high cardinality: 200 distinct values
director has a high cardinality: 607 distinct values
rank is uniformly distributed
title is uniformly distributed
director is uniformly distributed
rank has unique values

Reproduction

Analysis started: 2021-01-30 16:47:55.542028
Analysis finished: 2021-01-30 16:48:12.843314
Duration: 17.3 seconds
Software version: pandas-profiling v2.10.0
Download configuration: config.yaml

Variables

rank
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 936
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 498.1858974
Minimum: 1
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 53.75
Q1: 246.75
Median: 496.5
Q3: 746.25
95-th percentile: 946.5
Maximum: 1000
Range: 999
Interquartile range (IQR): 499.5
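
Fractional quantiles such as Q1 = 246.75 suggest that the report interpolates between neighbouring sorted values. The sketch below shows one common rule, linear interpolation. That this is the rule used for this report is an assumption, and `quantile` is a hypothetical helper, not part of pandas-profiling.

```python
def quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation.

    Hypothetical helper for illustration; the interpolation rule is an
    assumption about how the report's quantiles were computed.
    """
    data = sorted(values)
    if not data:
        raise ValueError("quantile() requires at least one value")
    # Fractional position of the quantile within the sorted data.
    pos = (len(data) - 1) * q
    lower = int(pos)
    frac = pos - lower
    if lower + 1 >= len(data):
        return float(data[lower])
    # Interpolate between the two neighbouring order statistics.
    return data[lower] + frac * (data[lower + 1] - data[lower])


sample = [1, 2, 3, 4]  # made-up data, not taken from the report
q1, q3 = quantile(sample, 0.25), quantile(sample, 0.75)
print(q1, q3, q3 - q1)  # the last value is the interquartile range
```

With this rule a quantile can fall between two observed values, which is why a column of integer ranks can have non-integer quartiles.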

Descriptive statistics

Standard deviation: 288.1005611
Coefficient of variation (CV): 0.5782993108
Kurtosis: -1.202928275
Mean: 498.1858974
Median Absolute Deviation (MAD): 250
Skewness: 0.009336325751
Sum: 466302
Variance: 83001.93332
Monotonicity: Strictly increasing
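
Several of the figures reported for rank are tied together by simple identities: the range is the maximum minus the minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The short check below recomputes them from the values printed in this report; the variable names are illustrative only.

```python
# Figures copied from the rank summary in this report.
minimum, maximum = 1, 1000
q1, q3 = 246.75, 746.25
mean = 498.1858974
std = 288.1005611

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr, round(cv, 4), round(variance, 2))
```

The recomputed values agree with the reported Range, IQR, CV and Variance up to the rounding of the printed inputs.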
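Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1000 | 1 | 0.1%
327 | 1 | 0.1%
340 | 1 | 0.1%
339 | 1 | 0.1%
338 | 1 | 0.1%
337 | 1 | 0.1%
335 | 1 | 0.1%
334 | 1 | 0.1%
333 | 1 | 0.1%
332 | 1 | 0.1%
Other values (926) | 926 | 98.9%

Minimum 5 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1000 | 1 | 0.1%
999 | 1 | 0.1%
998 | 1 | 0.1%
997 | 1 | 0.1%
996 | 1 | 0.1%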

title
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 935
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Value | Count
The Host | 2
The Invitation | 1
Divergent | 1
Big Hero 6 | 1
Relatos salvajes | 1
Other values (930) | 930

Length

Max length: 61
Median length: 13
Mean length: 14.65384615
Min length: 2

Characters and Unicode

Total characters: 13716
Distinct characters: 79
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
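
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample titles shown in this report. `category_counts` is a hypothetical helper; the report's own implementation may differ, and scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter


def category_counts(strings):
    """Count Unicode general categories (e.g. 'Lu', 'Ll', 'Zs') over all characters."""
    counts = Counter()
    for text in strings:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample titles listed in this report.
titles = ["Guardians of the Galaxy", "Prometheus", "Split", "Sing", "Suicide Squad"]
counts = category_counts(titles)
for category, count in counts.most_common():
    print(category, count)
```

Here `Ll`, `Lu` and `Zs` are the two-letter codes for Lowercase Letter, Uppercase Letter and Space Separator, the category names used in the tables of this report.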

Unique

Unique: 934
Unique (%): 99.8%
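
The figures above separate two notions: "Distinct" appears to count the different values present, while "Unique" counts the values that occur exactly once. For title, one value ("The Host") occurs twice among 936 rows, which is consistent with 935 distinct and 934 unique values. A minimal sketch of the two counts, with a hypothetical helper name and made-up input:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Made-up miniature column with one repeated title.
titles = ["The Host", "The Host", "Divergent", "Big Hero 6"]
print(distinct_and_unique(titles))
```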

Sample

1st row: Guardians of the Galaxy
2nd row: Prometheus
3rd row: Split
4th row: Sing
5th row: Suicide Squad

Common values

Value | Count | Frequency (%)
The Host | 2 | 0.2%
The Invitation | 1 | 0.1%
Divergent | 1 | 0.1%
Big Hero 6 | 1 | 0.1%
Relatos salvajes | 1 | 0.1%
Source Code | 1 | 0.1%
Disaster Movie | 1 | 0.1%
Steve Jobs | 1 | 0.1%
It Follows | 1 | 0.1%
Superbad | 1 | 0.1%
Other values (925) | 925 | 98.8%

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
the | 288 | 11.7%
of | 91 | 3.7%
a | 29 | 1.2%
and | 22 | 0.9%
2 | 22 | 0.9%
in | 19 | 0.8%
(not recoverable) | 15 | 0.6%
to | 12 | 0.5%
man | 11 | 0.4%
girl | 10 | 0.4%
Other values (1348) | 1940 | 78.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1523 | 11.1%
e | 1419 | 10.3%
a | 830 | 6.1%
o | 811 | 5.9%
n | 767 | 5.6%
r | 761 | 5.5%
i | 727 | 5.3%
t | 681 | 5.0%
s | 576 | 4.2%
h | 510 | 3.7%
Other values (69) | 5111 | 37.3%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9759 | 71.2%
Uppercase Letter | 2137 | 15.6%
Space Separator | 1523 | 11.1%
Other Punctuation | 159 | 1.2%
Decimal Number | 101 | 0.7%
Dash Punctuation | 29 | 0.2%
Open Punctuation | 4 | < 0.1%
Close Punctuation | 4 | < 0.1%

Most frequent character per category

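Lowercase Letter

Value | Count | Frequency (%)
e | 1419 | 14.5%
a | 830 | 8.5%
o | 811 | 8.3%
n | 767 | 7.9%
r | 761 | 7.8%
i | 727 | 7.4%
t | 681 | 7.0%
s | 576 | 5.9%
h | 510 | 5.2%
l | 431 | 4.4%
Other values (21) | 2246 | 23.0%

Uppercase Letter

Value | Count | Frequency (%)
T | 329 | 15.4%
S | 179 | 8.4%
M | 137 | 6.4%
B | 121 | 5.7%
D | 116 | 5.4%
A | 106 | 5.0%
P | 103 | 4.8%
H | 101 | 4.7%
C | 100 | 4.7%
L | 91 | 4.3%
Other values (16) | 754 | 35.3%

Decimal Number

Value | Count | Frequency (%)
2 | 34 | 33.7%
3 | 17 | 16.8%
1 | 14 | 13.9%
0 | 13 | 12.9%
4 | 6 | 5.9%
5 | 5 | 5.0%
6 | 3 | 3.0%
9 | 3 | 3.0%
8 | 3 | 3.0%
7 | 3 | 3.0%

Other Punctuation

Value | Count | Frequency (%)
: | 80 | 50.3%
' | 35 | 22.0%
. | 21 | 13.2%
, | 9 | 5.7%
& | 6 | 3.8%
! | 4 | 2.5%
? | 2 | 1.3%
/ | 2 | 1.3%

Space Separator

Value | Count | Frequency (%)
(space) | 1523 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 29 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 4 | 100.0%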

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11896 | 86.7%
Common | 1820 | 13.3%

Most frequent character per script

Latin

Value | Count | Frequency (%)
e | 1419 | 11.9%
a | 830 | 7.0%
o | 811 | 6.8%
n | 767 | 6.4%
r | 761 | 6.4%
i | 727 | 6.1%
t | 681 | 5.7%
s | 576 | 4.8%
h | 510 | 4.3%
l | 431 | 3.6%
Other values (47) | 4383 | 36.8%

Common

Value | Count | Frequency (%)
(space) | 1523 | 83.7%
: | 80 | 4.4%
' | 35 | 1.9%
2 | 34 | 1.9%
- | 29 | 1.6%
. | 21 | 1.2%
3 | 17 | 0.9%
1 | 14 | 0.8%
0 | 13 | 0.7%
, | 9 | 0.5%
Other values (12) | 45 | 2.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13709 | 99.9%
None | 7 | 0.1%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 1523 | 11.1%
e | 1419 | 10.4%
a | 830 | 6.1%
o | 811 | 5.9%
n | 767 | 5.6%
r | 761 | 5.6%
i | 727 | 5.3%
t | 681 | 5.0%
s | 576 | 4.2%
h | 510 | 3.7%
Other values (64) | 5104 | 37.2%

None

Value | Count | Frequency (%)
é | 3 | 42.9%
è | 1 | 14.3%
ä | 1 | 14.3%
í | 1 | 14.3%
á | 1 | 14.3%

genre
Categorical

HIGH CARDINALITY

Distinct: 200
Distinct (%): 21.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Value | Count
Action,Adventure,Sci-Fi | 50
Drama | 43
Comedy,Drama,Romance | 32
Comedy | 30
Drama,Romance | 28
Other values (195) | 753

Length

Max length: 26
Median length: 20
Mean length: 18.20512821
Min length: 5

Characters and Unicode

Total characters: 17040
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 83
Unique (%): 8.9%

Sample

1st row: Action,Adventure,Sci-Fi
2nd row: Adventure,Mystery,Sci-Fi
3rd row: Horror,Thriller
4th row: Animation,Comedy,Family
5th row: Action,Adventure,Fantasy

Common values

Value | Count | Frequency (%)
Action,Adventure,Sci-Fi | 50 | 5.3%
Drama | 43 | 4.6%
Comedy,Drama,Romance | 32 | 3.4%
Comedy | 30 | 3.2%
Drama,Romance | 28 | 3.0%
Action,Adventure,Fantasy | 26 | 2.8%
Animation,Adventure,Comedy | 26 | 2.8%
Comedy,Drama | 25 | 2.7%
Comedy,Romance | 25 | 2.7%
Crime,Drama,Thriller | 22 | 2.4%
Other values (190) | 629 | 67.2%

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
action,adventure,sci-fi | 50 | 5.3%
drama | 43 | 4.6%
comedy,drama,romance | 32 | 3.4%
comedy | 30 | 3.2%
drama,romance | 28 | 3.0%
animation,adventure,comedy | 26 | 2.8%
action,adventure,fantasy | 26 | 2.8%
comedy,romance | 25 | 2.7%
comedy,drama | 25 | 2.7%
crime,drama,mystery | 22 | 2.4%
Other values (190) | 629 | 67.2%

Most occurring characters

Value | Count | Frequency (%)
r | 1786 | 10.5%
, | 1468 | 8.6%
a | 1459 | 8.6%
e | 1332 | 7.8%
m | 1110 | 6.5%
i | 1103 | 6.5%
o | 1065 | 6.2%
n | 865 | 5.1%
t | 831 | 4.9%
y | 713 | 4.2%
Other values (21) | 5308 | 31.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 12940 | 75.9%
Uppercase Letter | 2518 | 14.8%
Other Punctuation | 1468 | 8.6%
Dash Punctuation | 114 | 0.7%

Most frequent character per category

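Lowercase Letter

Value | Count | Frequency (%)
r | 1786 | 13.8%
a | 1459 | 11.3%
e | 1332 | 10.3%
m | 1110 | 8.6%
i | 1103 | 8.5%
o | 1065 | 8.2%
n | 865 | 6.7%
t | 831 | 6.4%
y | 713 | 5.5%
c | 555 | 4.3%
Other values (8) | 2121 | 16.4%

Uppercase Letter

Value | Count | Frequency (%)
A | 584 | 23.2%
D | 474 | 18.8%
C | 409 | 16.2%
F | 262 | 10.4%
T | 183 | 7.3%
H | 136 | 5.4%
R | 131 | 5.2%
S | 130 | 5.2%
M | 120 | 4.8%
B | 71 | 2.8%

Other Punctuation

Value | Count | Frequency (%)
, | 1468 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 114 | 100.0%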

Most occurring scripts

Value | Count | Frequency (%)
Latin | 15458 | 90.7%
Common | 1582 | 9.3%

Most frequent character per script

Latin

Value | Count | Frequency (%)
r | 1786 | 11.6%
a | 1459 | 9.4%
e | 1332 | 8.6%
m | 1110 | 7.2%
i | 1103 | 7.1%
o | 1065 | 6.9%
n | 865 | 5.6%
t | 831 | 5.4%
y | 713 | 4.6%
A | 584 | 3.8%
Other values (19) | 4610 | 29.8%

Common

Value | Count | Frequency (%)
, | 1468 | 92.8%
- | 114 | 7.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 17040 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
r | 1786 | 10.5%
, | 1468 | 8.6%
a | 1459 | 8.6%
e | 1332 | 7.8%
m | 1110 | 6.5%
i | 1103 | 6.5%
o | 1065 | 6.2%
n | 865 | 5.1%
t | 831 | 4.9%
y | 713 | 4.2%
Other values (21) | 5308 | 31.2%

director
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 607
Distinct (%): 64.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.7 KiB

Value | Count
Ridley Scott | 8
Paul W.S. Anderson | 6
Michael Bay | 6
David Yates | 6
M. Night Shyamalan | 6
Other values (602) | 904

Length

Max length: 32
Median length: 13
Mean length: 13.13782051
Min length: 3

Characters and Unicode

Total characters: 12297
Distinct characters: 69
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 418
Unique (%): 44.7%

Sample

1st row: James Gunn
2nd row: Ridley Scott
3rd row: M. Night Shyamalan
4th row: Christophe Lourdelet
5th row: David Ayer

Common values

Value | Count | Frequency (%)
Ridley Scott | 8 | 0.9%
Paul W.S. Anderson | 6 | 0.6%
Michael Bay | 6 | 0.6%
David Yates | 6 | 0.6%
M. Night Shyamalan | 6 | 0.6%
Antoine Fuqua | 5 | 0.5%
Justin Lin | 5 | 0.5%
Danny Boyle | 5 | 0.5%
Denis Villeneuve | 5 | 0.5%
J.J. Abrams | 5 | 0.5%
Other values (597) | 879 | 93.9%

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
david | 35 | 1.8%
john | 23 | 1.2%
james | 20 | 1.0%
scott | 18 | 0.9%
michael | 18 | 0.9%
paul | 18 | 0.9%
steven | 13 | 0.7%
robert | 12 | 0.6%
ben | 12 | 0.6%
lee | 11 | 0.6%
Other values (920) | 1778 | 90.8%

Most occurring characters

Value | Count | Frequency (%)
e | 1156 | 9.4%
(space) | 1022 | 8.3%
a | 980 | 8.0%
n | 876 | 7.1%
r | 829 | 6.7%
o | 733 | 6.0%
i | 685 | 5.6%
l | 569 | 4.6%
t | 451 | 3.7%
s | 441 | 3.6%
Other values (59) | 4555 | 37.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 9176 | 74.6%
Uppercase Letter | 2013 | 16.4%
Space Separator | 1022 | 8.3%
Other Punctuation | 67 | 0.5%
Dash Punctuation | 19 | 0.2%

Most frequent character per category

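Lowercase Letter

Value | Count | Frequency (%)
e | 1156 | 12.6%
a | 980 | 10.7%
n | 876 | 9.5%
r | 829 | 9.0%
o | 733 | 8.0%
i | 685 | 7.5%
l | 569 | 6.2%
t | 451 | 4.9%
s | 441 | 4.8%
h | 336 | 3.7%
Other values (28) | 2120 | 23.1%

Uppercase Letter

Value | Count | Frequency (%)
S | 196 | 9.7%
J | 186 | 9.2%
M | 173 | 8.6%
A | 141 | 7.0%
D | 125 | 6.2%
B | 119 | 5.9%
C | 117 | 5.8%
G | 116 | 5.8%
R | 109 | 5.4%
L | 99 | 4.9%
Other values (17) | 632 | 31.4%

Other Punctuation

Value | Count | Frequency (%)
. | 65 | 97.0%
' | 2 | 3.0%

Space Separator

Value | Count | Frequency (%)
(space) | 1022 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 19 | 100.0%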

Most occurring scripts

Value | Count | Frequency (%)
Latin | 11189 | 91.0%
Common | 1108 | 9.0%

Most frequent character per script

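Latin

Value | Count | Frequency (%)
e | 1156 | 10.3%
a | 980 | 8.8%
n | 876 | 7.8%
r | 829 | 7.4%
o | 733 | 6.6%
i | 685 | 6.1%
l | 569 | 5.1%
t | 451 | 4.0%
s | 441 | 3.9%
h | 336 | 3.0%
Other values (55) | 4133 | 36.9%

Common

Value | Count | Frequency (%)
(space) | 1022 | 92.2%
. | 65 | 5.9%
- | 19 | 1.7%
' | 2 | 0.2%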

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 12254 | 99.7%
None | 43 | 0.3%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
e | 1156 | 9.4%
(space) | 1022 | 8.3%
a | 980 | 8.0%
n | 876 | 7.1%
r | 829 | 6.8%
o | 733 | 6.0%
i | 685 | 5.6%
l | 569 | 4.6%
t | 451 | 3.7%
s | 441 | 3.6%
Other values (46) | 4512 | 36.8%

None

Value | Count | Frequency (%)
é | 10 | 23.3%
á | 9 | 20.9%
ó | 4 | 9.3%
ö | 4 | 9.3%
å | 4 | 9.3%
ñ | 3 | 7.0%
ç | 3 | 7.0%
Ø | 1 | 2.3%
í | 1 | 2.3%
ë | 1 | 2.3%
Other values (3) | 3 | 7.0%

year
Real number (ℝ≥0)

Distinct: 11
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2012.771368
Minimum: 2006
Maximum: 2016
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 2006
5-th percentile: 2007
Q1: 2010
Median: 2014
Q3: 2016
95-th percentile: 2016
Maximum: 2016
Range: 10
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.178987268
Coefficient of variation (CV): 0.001579408034
Kurtosis: -0.8070367081
Mean: 2012.771368
Median Absolute Deviation (MAD): 2
Skewness: -0.6863119763
Sum: 1883954
Variance: 10.10596005
Monotonicity: Not monotonic
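
The Median Absolute Deviation listed above is the median of the absolute distances of the values from their median. The sketch below computes it without any scaling factor; that the report uses the unscaled form is an assumption, and both the helper name and the sample years are made up for illustration.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of |x - median(values)|, with no consistency scaling applied."""
    data = list(values)
    centre = median(data)
    return median(abs(v - centre) for v in data)


# Made-up sample of release years, not the report's data.
years = [2006, 2010, 2014, 2016, 2016]
print(median_absolute_deviation(years))
```

Unlike the standard deviation, this measure is barely affected by a few extreme values, which makes it a useful companion figure for skewed columns such as votes and revenue.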
Histogram with fixed size bins (bins=11)

Common values

Value | Count | Frequency (%)
2016 | 268 | 28.6%
2015 | 123 | 13.1%
2014 | 95 | 10.1%
2013 | 86 | 9.2%
2012 | 62 | 6.6%
2010 | 59 | 6.3%
2011 | 58 | 6.2%
2009 | 49 | 5.2%
2008 | 49 | 5.2%
2007 | 46 | 4.9%

Minimum 5 values

Value | Count | Frequency (%)
2006 | 41 | 4.4%
2007 | 46 | 4.9%
2008 | 49 | 5.2%
2009 | 49 | 5.2%
2010 | 59 | 6.3%

Maximum 5 values

Value | Count | Frequency (%)
2016 | 268 | 28.6%
2015 | 123 | 13.1%
2014 | 95 | 10.1%
2013 | 86 | 9.2%
2012 | 62 | 6.6%

runtime
Real number (ℝ≥0)

Distinct: 92
Distinct (%): 9.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.2724359
Minimum: 66
Maximum: 187
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 66
5-th percentile: 88
Q1: 100
Median: 111
Q3: 123
95-th percentile: 149
Maximum: 187
Range: 121
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 18.55079827
Coefficient of variation (CV): 0.1637715135
Kurtosis: 0.6336593054
Mean: 113.2724359
Median Absolute Deviation (MAD): 12
Skewness: 0.7911194262
Sum: 106023
Variance: 344.1321164
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
108 | 29 | 3.1%
117 | 26 | 2.8%
100 | 26 | 2.8%
110 | 25 | 2.7%
118 | 25 | 2.7%
102 | 25 | 2.7%
106 | 24 | 2.6%
104 | 22 | 2.4%
112 | 22 | 2.4%
101 | 21 | 2.2%
Other values (82) | 691 | 73.8%

Minimum 5 values

Value | Count | Frequency (%)
66 | 1 | 0.1%
73 | 1 | 0.1%
80 | 2 | 0.2%
81 | 4 | 0.4%
82 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
187 | 1 | 0.1%
180 | 2 | 0.2%
172 | 1 | 0.1%
170 | 1 | 0.1%
169 | 3 | 0.3%

rating
Real number (ℝ≥0)

Distinct: 55
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.729166667
Minimum: 1.9
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 1.9
5-th percentile: 5.175
Q1: 6.2
Median: 6.8
Q3: 7.4
95-th percentile: 8.1
Maximum: 9
Range: 7.1
Interquartile range (IQR): 1.2

Descriptive statistics

Standard deviation: 0.9352249579
Coefficient of variation (CV): 0.1389807987
Kurtosis: 1.190310556
Mean: 6.729166667
Median Absolute Deviation (MAD): 0.6
Skewness: -0.7045209798
Sum: 6298.5
Variance: 0.8746457219
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
6.7 | 47 | 5.0%
7 | 44 | 4.7%
7.1 | 44 | 4.7%
6.3 | 41 | 4.4%
7.3 | 40 | 4.3%
7.8 | 39 | 4.2%
6.6 | 39 | 4.2%
7.2 | 39 | 4.2%
6.5 | 37 | 4.0%
6.2 | 36 | 3.8%
Other values (45) | 530 | 56.6%

Minimum 5 values

Value | Count | Frequency (%)
1.9 | 1 | 0.1%
2.7 | 1 | 0.1%
3.2 | 1 | 0.1%
3.5 | 2 | 0.2%
3.7 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
9 | 1 | 0.1%
8.8 | 1 | 0.1%
8.6 | 3 | 0.3%
8.5 | 6 | 0.6%
8.4 | 3 | 0.3%

votes
Real number (ℝ≥0)

Distinct: 933
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 175270.2169
Minimum: 61
Maximum: 1791916
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 61
5-th percentile: 1586.25
Q1: 41593
Median: 114918.5
Q3: 249538
95-th percentile: 530938.75
Maximum: 1791916
Range: 1791855
Interquartile range (IQR): 207945

Descriptive statistics

Standard deviation: 190582.4207
Coefficient of variation (CV): 1.08736341
Kurtosis: 11.27174861
Mean: 175270.2169
Median Absolute Deviation (MAD): 88404
Skewness: 2.493379996
Sum: 164052923
Variance: 3.632165907 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
291 | 2 | 0.2%
97141 | 2 | 0.2%
1427 | 2 | 0.2%
125693 | 1 | 0.1%
406219 | 1 | 0.1%
299718 | 1 | 0.1%
461509 | 1 | 0.1%
92868 | 1 | 0.1%
240323 | 1 | 0.1%
101058 | 1 | 0.1%
Other values (923) | 923 | 98.6%

Minimum 5 values

Value | Count | Frequency (%)
61 | 1 | 0.1%
102 | 1 | 0.1%
115 | 1 | 0.1%
164 | 1 | 0.1%
173 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1791916 | 1 | 0.1%
1583625 | 1 | 0.1%
1222645 | 1 | 0.1%
1047747 | 1 | 0.1%
1045588 | 1 | 0.1%

revenue
Real number (ℝ≥0)

Distinct: 790
Distinct (%): 84.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 80.75192308
Minimum: 0
Maximum: 936.63
Zeros: 1
Zeros (%): 0.1%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.32
Q1: 17.4425
Median: 48.15
Q3: 102.4225
95-th percentile: 292.075
Maximum: 936.63
Range: 936.63
Interquartile range (IQR): 84.98

Descriptive statistics

Standard deviation: 99.51826197
Coefficient of variation (CV): 1.232394947
Kurtosis: 11.9166468
Mean: 80.75192308
Median Absolute Deviation (MAD): 37.275
Skewness: 2.764949299
Sum: 75583.8
Variance: 9903.884465
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
48.15 | 98 | 10.5%
0.03 | 5 | 0.5%
0.04 | 4 | 0.4%
0.01 | 4 | 0.4%
0.05 | 4 | 0.4%
0.32 | 4 | 0.4%
0.02 | 4 | 0.4%
2.2 | 3 | 0.3%
0.54 | 3 | 0.3%
0.15 | 3 | 0.3%
Other values (780) | 804 | 85.9%

Minimum 5 values

Value | Count | Frequency (%)
0 | 1 | 0.1%
0.01 | 4 | 0.4%
0.02 | 4 | 0.4%
0.03 | 5 | 0.5%
0.04 | 4 | 0.4%

Maximum 5 values

Value | Count | Frequency (%)
936.63 | 1 | 0.1%
760.51 | 1 | 0.1%
652.18 | 1 | 0.1%
623.28 | 1 | 0.1%
533.32 | 1 | 0.1%

metascore
Real number (ℝ≥0)

Distinct: 84
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58.98504274
Minimum: 11
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 11
5-th percentile: 31
Q1: 47
Median: 59.5
Q3: 72
95-th percentile: 85
Maximum: 100
Range: 89
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 17.19475702
Coefficient of variation (CV): 0.2915104614
Kurtosis: -0.6122051468
Mean: 58.98504274
Median Absolute Deviation (MAD): 12.5
Skewness: -0.1238873467
Sum: 55210
Variance: 295.6596691
Monotonicity: Not monotonic
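Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
66 | 25 | 2.7%
72 | 25 | 2.7%
68 | 25 | 2.7%
64 | 24 | 2.6%
57 | 23 | 2.5%
51 | 22 | 2.4%
65 | 22 | 2.4%
48 | 21 | 2.2%
81 | 21 | 2.2%
76 | 21 | 2.2%
Other values (74) | 707 | 75.5%

Minimum 5 values

Value | Count | Frequency (%)
11 | 1 | 0.1%
15 | 1 | 0.1%
16 | 1 | 0.1%
18 | 4 | 0.4%
19 | 1 | 0.1%

Maximum 5 values

Value | Count | Frequency (%)
100 | 1 | 0.1%
99 | 1 | 0.1%
98 | 1 | 0.1%
96 | 4 | 0.4%
95 | 3 | 0.3%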

Interactions

[42 interaction plots (Matplotlib SVG figures) omitted]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
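
The calculation described above can be sketched in a few lines of plain Python. `pearson_r` is a hypothetical helper written for illustration, not the routine the report itself calls, and the sample data are made up.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


xs = [1, 2, 3, 4, 5]  # made-up data
ys = [2, 1, 4, 3, 5]
r = pearson_r(xs, ys)
# Shifting and rescaling xs leaves r unchanged (location and scale invariance).
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)
print(r, r_rescaled)
```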

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
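
A minimal sketch of this procedure: replace each value by its rank, then apply the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention and an assumption here; `average_ranks` and `spearman_rho` are hypothetical helper names.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# A nonlinear but strictly increasing relationship (made-up data).
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]
print(spearman_rho(xs, ys))
```

Because only the ordering of the values matters, any strictly increasing transformation of either variable leaves ρ unchanged.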

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
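
The pair counting described above can be written directly, as in the sketch below. It implements the plain form of the definition (often called tau-a); statistical libraries commonly apply a tie correction instead, so results may differ from the report when the data contain ties. `kendall_tau_a` is a hypothetical helper name.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = total = 0
    for i, j in combinations(range(len(xs)), 2):
        total += 1
        # A pair is concordant when both variables move in the same direction.
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / total


# Made-up data: one of the three pairs is out of order.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This direct version compares every pair and so runs in quadratic time, which is fine for a dataset of 936 rows.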

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

index | rank | title | genre | director | year | runtime | rating | votes | revenue | metascore
0 | 1 | Guardians of the Galaxy | Action,Adventure,Sci-Fi | James Gunn | 2014 | 121 | 8.1 | 757074 | 333.13 | 76.0
1 | 2 | Prometheus | Adventure,Mystery,Sci-Fi | Ridley Scott | 2012 | 124 | 7.0 | 485820 | 126.46 | 65.0
2 | 3 | Split | Horror,Thriller | M. Night Shyamalan | 2016 | 117 | 7.3 | 157606 | 138.12 | 62.0
3 | 4 | Sing | Animation,Comedy,Family | Christophe Lourdelet | 2016 | 108 | 7.2 | 60545 | 270.32 | 59.0
4 | 5 | Suicide Squad | Action,Adventure,Fantasy | David Ayer | 2016 | 123 | 6.2 | 393727 | 325.02 | 40.0
5 | 6 | The Great Wall | Action,Adventure,Fantasy | Yimou Zhang | 2016 | 103 | 6.1 | 56036 | 45.13 | 42.0
6 | 7 | La La Land | Comedy,Drama,Music | Damien Chazelle | 2016 | 128 | 8.3 | 258682 | 151.06 | 93.0
7 | 8 | Mindhorn | Comedy | Sean Foley | 2016 | 89 | 6.4 | 2490 | 48.15 | 71.0
8 | 9 | The Lost City of Z | Action,Adventure,Biography | James Gray | 2016 | 141 | 7.1 | 7188 | 8.01 | 78.0
9 | 10 | Passengers | Adventure,Drama,Romance | Morten Tyldum | 2016 | 116 | 7.0 | 192177 | 100.01 | 41.0

Last rows

index | rank | title | genre | director | year | runtime | rating | votes | revenue | metascore
926 | 989 | Martyrs | Horror | Pascal Laugier | 2008 | 99 | 7.1 | 63785 | 48.15 | 89.0
927 | 991 | Underworld: Rise of the Lycans | Action,Adventure,Fantasy | Patrick Tatopoulos | 2009 | 92 | 6.6 | 129708 | 45.80 | 44.0
928 | 992 | Taare Zameen Par | Drama,Family,Music | Aamir Khan | 2007 | 165 | 8.5 | 102697 | 1.20 | 42.0
929 | 994 | Resident Evil: Afterlife | Action,Adventure,Horror | Paul W.S. Anderson | 2010 | 97 | 5.9 | 140900 | 60.13 | 37.0
930 | 995 | Project X | Comedy | Nima Nourizadeh | 2012 | 88 | 6.7 | 164088 | 54.72 | 48.0
931 | 996 | Secret in Their Eyes | Crime,Drama,Mystery | Billy Ray | 2015 | 111 | 6.2 | 27585 | 48.15 | 45.0
932 | 997 | Hostel: Part II | Horror | Eli Roth | 2007 | 94 | 5.5 | 73152 | 17.54 | 46.0
933 | 998 | Step Up 2: The Streets | Drama,Music,Romance | Jon M. Chu | 2008 | 98 | 6.2 | 70699 | 58.01 | 50.0
934 | 999 | Search Party | Adventure,Comedy | Scot Armstrong | 2014 | 93 | 5.6 | 4881 | 48.15 | 22.0
935 | 1000 | Nine Lives | Comedy,Family,Fantasy | Barry Sonnenfeld | 2016 | 87 | 5.3 | 12435 | 19.64 | 11.0